\section{Related work}\label{sec:related}

Incremental view maintenance~\cite{gupta-sigmod93, griffin-sigmod95, chaudhuri-icde95,
gupta-idb95, chirkova-book12} is a much studied problem in databases.
A survey of results for Datalog queries is present in~\cite{motik-ai19}.
The standard approach is as follows: given a query $Q$, discover a ``delta query'',
a ``differential'' version $\Delta Q$ that satisfies the equation:
$Q(d+\Delta d)=Q(d)+\Delta Q(d,\Delta d)$, and which can be used to compute
the change for a new input reusing the previous output.
DBToaster introduced recursive recursive IVM~\cite{ahmad-vldb09, koch-pods10}, where
the incrementalization process is repeated for the delta query.

Many custom algorithms were published for various classes of queries: e.g.~\cite{koch-pods16}
handles positive nested relational calculus.  DYN~\cite{idris-sigmod17}
and IDYN~\cite{idris-vldb18, idris-sigmod19} focus on acyclic conjunctive queries.  Instead
of keeping the output view materialized they build data structures that allow efficiently
querying the output views.  PAI maps~\cite{abeysinghe-sigmod22} are specially
designed for queries with correlated aggregations.
AJU~\cite{wang-sigmod20} focuses on foreign-key joins.  It is a matter
of future work to evaluate whether custom \dbsp operators
can match the efficiency of systems specialized for narrow classes
of queries.

\dbsp is a bottom-up system, which always produces eagerly
the \emph{changes} to the output views.
Instead of maintaining the output view entirely, \dbsp proposes
generating deltas as the output of the computation (similar to the kSQL~\cite{jafarpour-edbt19}
\texttt{EMIT CHANGES} queries).  The idea that both
inputs and outputs to an IVM system are streams of changes
seems trivial, but this is key to the symmetry of our solution:
both in our definition of IVM~(\ref{def:inc}), and the fundamental
reason that the chain rule exists --- the chain rule is the one that makes our
structural induction IVM algorithm possible.

Several IVM algorithms for Datalog-like languages use counting based
approaches~\cite{Dewan-iis92,motik-aaai15} that maintain the number of derivations of each
output fact: DRed~\cite{gupta-sigmod93} and its variants~\cite{Ceri-VLDB91,Wolfson-sigmod91,
Staudt-vldb96,Kotowski-rr11,Lu-sigmod95,Apt-sigmod87}, the backward-forward algorithm
and variants~\cite{motik-aaai15,Harrison-wdd92,motik-ai19}.
\dbsp is more general,
and our incrementalization algorithm handles arbitrary recursive queries and
generates more efficient plans for recursive queries
in the presence of arbitrary updates (especially deletions, where competing approaches
may over-delete).  Interestingly, the \zrs weights in \dbsp are related
to the counting-number-of-derivations approaches, but our use of the $\distinct$
operator shows that precise counting is not necessary.

Picallo et al.~\cite{picallo-scop19} provide a general solution to IVM for
rich languages.  \dbsp requires a group structure on the values operated on;
this assumption has two major practical benefits: it simplifies the mathematics considerably
(e.g., Picallo uses monoid actions to model changes), and it provides a general, simple
algorithm (\ref{algorithm-inc}) for incrementalizing arbitrary programs.  The downside of
\dbsp is that one has to find a suitable group structure (e.g., \zrs for sets) to ``embed''
the computation.  Picallo's notion of ``derivative'' is not unique: they need creativity to choose
the right derivative definition, we need creativity to find the right group structure.

Finding a suitable group structure has proven easy for relations (both~\cite{koch-pods10}
and~\cite{green-tcs11} use \zrs to uniformly model data and insertions/deletions), but it is
not obvious how to do it for other data types, such as sorted collections, or tree-shaped
collections (e.g., XML or JSON documents)~\cite{foster-planx08}.  An intriguing question
is ``what other interesting group structures could this be applied to besides \zrs?''
Papers such as~\cite{nikolic-icmd18} explore other possibilities, such as matrix algebra,
linear ML models, or conjunctive queries.

\dbsp does not do anything special for triangle queries~\cite{kara-tds20}.  Are there
better algorithms for this case?

In \secref{sec:extensions} we have briefly mentioned that \dbsp can easily
model window and stream database queries~\cite{arasu-tr02,aurora}; it is an
interesting question whether there are CQL queries that cannot be expressed in \dbsp
(we conjecture that there aren't any).

\citet{bonifati-iclp2018} implemented a verified IVM algorithm for a particular
class of graph queries called Regular Datalog, with an implementation machine-checked in the
Coq proof assistant. Their focus is on a particular algorithm and the approach does not
consider other SQL operators, general recursion, or custom operators (although it is modular
in the sense that it works on any query by incrementalizing it recursively). Furthermore,
for all queries a deletion in the input change stream requires running the non-incremental
query to recover.  We formally verify the theorems in our paper, which
are much broader in scope, but not our implementations.

\dbsp is also related to Differential Dataflow (DD)~\cite{mcsherry-cidr13, murray-sosp13}
and its theoretical foundations~\cite{abadi-fossacs15} (and recently~\cite{mcsherry-vldb20,chothia-vldb16}).
DD's computational model is more powerful than
\dbsp, since it allows time values to be part of an arbitrary lattice.
In fact, DD is the only other framework which we are aware of that can incrementalize
recursive queries as efficiently as \dbsp does.
In contrast, our model uses either ``linear'' times, or nested time dimensions via the modular lifting transformer ($\lift{}$).
\dbsp can express both
incremental and non-incremental computations.  Most importantly, \dbsp comes with Algorithm~\ref{algorithm-inc}, a syntax-directed translation that can convert
any expressible query into an incremental version --- in DD users have
to assemble incremental queries manually using incremental operators.
(materialize.com offers a product that automates incrementalization
for SQL queries based on DD.  Differential Datalog~\cite{ryzhyk-datalog19}
does it for a Datalog dialect.)  Unlike DD, \dbsp is a modular theory,
which easily accommodates the addition of new operators:  as long as we can
express a new operator as a \dbsp circuit, we can (1) define its incremental version,
(2) apply the incrementalization algorithm to obtain an efficient
incremental implementation, and (3) be confident that it composes with any
other operators.
